Abstract. The phase of the short-time Fourier transform (STFT)
is sometimes considered difficult to interpret. However, phase
information is important for improved analysis and processing.
The phase derivative, in particular, is essential for the reassignment
method and the phase vocoder algorithm. To understand the phase
derivative of the STFT more thoroughly, we describe an interesting
phenomenon: a recurring pattern in the neighborhood of zeros.
Contrary to what one might expect, the behavior there is not
arbitrary: the phase derivative always shows a singularity with the
same characteristic shape, a negative and a positive peak of
infinite height at these points. We show this behavior in a
numerical investigation, present a simple explicit analytic example,
and then give a complete analytical treatment. For this we present
several intermediate results about the regularity of the STFT for
Schwartz windows, which are of independent interest.



1. Introduction 

The short-time Fourier transform (STFT) [?] is a time-frequency
representation widely used in audio signal processing. A common
definition of the STFT is

(1)  $V(f,g)(x,\omega) = \int_{\mathbb{R}} f(t)\,\overline{g(t-x)}\,e^{-2\pi i \omega t}\,dt.$

The STFT $V(f,g)(x,\omega)$ provides information about the frequency
content of the signal $f$ at time $x$ and frequency $\omega$. The
analyzing window $g$ determines the resolution in time and frequency.

The interpretation of the modulus of the STFT is relatively easy,
since the spectrogram (defined as the squared absolute value of the
STFT) can be interpreted as a time-frequency distribution of the
signal energy. This interpretation led to the important success of
the STFT in signal processing. In particular, it has been widely used
in speech processing and acoustics as a graphical tool for signal
analysis [20].
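As a rough numerical illustration of the definition of the STFT and of
the spectrogram, the integral can be approximated by a midpoint sum.
This sketch is not taken from the experiments of the paper (which use
LTFAT); the helper names `stft_point`, `tone`, `gauss_window` and
`closed_form`, the window $e^{-\pi t^2}$ and all grid parameters are
assumptions chosen only for this example. For the pure tone
$f(t) = e^{2\pi i \omega_1 t}$ and this window, the frequency-invariant
STFT has the closed form $e^{-2\pi i(\omega-\omega_1)x}\,e^{-\pi(\omega-\omega_1)^2}$,
which the sum should reproduce.

```python
import cmath
import math


def stft_point(f, g, x, omega, half_width=8.0, n=3200):
    """Midpoint-rule approximation of V(f,g)(x,omega) as in definition (1).

    The integration runs over [x - half_width, x + half_width], which is
    adequate only if the window g is negligible outside that range.
    """
    dt = 2.0 * half_width / n
    total = 0j
    for k in range(n):
        t = x - half_width + (k + 0.5) * dt
        kernel = cmath.exp(-2j * math.pi * omega * t)
        total += f(t) * complex(g(t - x)).conjugate() * kernel
    return total * dt


def tone(t, omega1=3.0):
    """Pure tone exp(2*pi*i*omega1*t); omega1 is an illustrative value."""
    return cmath.exp(2j * math.pi * omega1 * t)


def gauss_window(t):
    """Assumed window exp(-pi t^2), chosen because its transform is explicit."""
    return math.exp(-math.pi * t * t)


def closed_form(x, omega, omega1=3.0):
    """Exact V(f,g)(x,omega) for the tone and window above."""
    d = omega - omega1
    return cmath.exp(-2j * math.pi * d * x) * math.exp(-math.pi * d * d)


v = stft_point(tone, gauss_window, 1.5, 3.4)
spectrogram_value = abs(v) ** 2  # squared modulus, i.e. the spectrogram
```

The modulus depends only on $\omega - \omega_1$, while the time position
$x$ enters solely through the phase factor; this is the part of the
information the spectrogram discards.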

But the interpretation of the phase of the STFT is less obvious,
and the phase is often not considered in applications. In most
analysis/synthesis schemes that modify the STFT, the magnitude is
modified but the phase is left unchanged. This is a problem, because
amplitude and phase of the STFT are known not to be independent;
they can even carry the same information. For Gaussian windows this
can be found in [12]. So a modification of the amplitude alone,
without controlling the effect on the phase, can produce unexpected
results. Furthermore, the phase derivative, as explained below, is
interesting in itself for certain applications.

In digital image processing it is well known that the phase
information of the discrete Fourier transform is at least as important
as the amplitude information. In [19] it is shown that as long as the
phase of the discrete Fourier transform of an image is retained and
the amplitude is set to 1, the image can still be recognized. If the
phase is set to zero but the amplitude is retained, the image cannot
be discerned. In the same article it is stated that the
intelligibility of a sentence is retained if the phase of the discrete
Fourier transform is combined with unit magnitude; for a concrete
example see [3]. So the phase information can be considered essential
for reconstruction, and it is therefore also pivotal for applications
that modify the STFT coefficients. For this type of application, in
particular for applications using STFT or Gabor frame multipliers [?],
which motivated the present study, a better understanding of the
structure of the phase is necessary to improve the processing
possibilities. It is known that a multiplier has a 'local effect' in
the time-frequency plane, in the sense of a small time-frequency
spread [16]. But due to the uncertainty principle, it can never be
perfectly localized. This contradicts the intuitive view of a
multiplier, in which the multiplication would just correspond to
amplification or attenuation of single time-frequency components. As
a particularly interesting consequence of this phenomenon, a
time-frequency shift of a signal could be realized by a complex
multiplier that manipulates the phase. To control this behavior and
investigate how to exploit it, for example by optimizing the effect
of a multiplier through manipulation of the phase, a thorough
understanding of the effect of the phase is essential.

It is known [?] that the phase of the DFT becomes arbitrary near
zeros, see also [3]. So it could be expected that the STFT shows a
similar behavior. Interestingly, in this paper we observe that the
behavior of the phase derivative around zeros is far from arbitrary.
The over-complete representation of the STFT and the resulting
reproducing kernel property are in contrast to the basis property of
the DFT. This difference also leads to the aforementioned difference
in the phase.

The phase of the STFT, also called localized phase [23], is sometimes
considered unstable with respect to noise. This is mostly based on
the fact that the reconstruction from the noisy Fourier phase is
unstable [?]. But, again, the phase of the STFT, respectively its
derivative, can make perfect sense. For example, in [?] reassignment
is applied to Gaussian noise. Reassignment there leads to zeros
surrounded by thin, froth-like structures. The STFT of such noise is
structured and correlated, which is again an effect of the
over-completeness of the STFT.

In this paper we investigate the behavior of the phase derivative of
the STFT around zeros. At these positions a certain structure can
be observed: a positive and a negative peak are coupled. In Section 2
we give an introduction to the phase derivative of the STFT. In
Section 3 we present numerical experiments and observations about
this behavior. Note that those experiments have been presented at
a conference [?]. In Section 4 we give a very simple but illustrative
example where the pertinent analytic calculations can be performed
explicitly. In Section 5 we give a mathematical treatment of this
behavior and present a complete analytic explanation (for a certain
large class of windows). We show that the described phenomenon
appears whenever the STFT satisfies certain differentiability
conditions. In the course of the proof we also derive a general
smoothness result for the STFT with Schwartz windows.

All experiments in this paper have been done using the Linear
Time-Frequency Analysis Toolbox (LTFAT) [22], as the necessary
functionality is a core part of LTFAT. LTFAT is freely available for
download on the Internet.



2. The Phase Derivative of the STFT 

The STFT given by (1) is called the frequency-invariant STFT. In
this case, if the input signal is modulated by a complex exponential,
then the output coefficients are unaltered, just shifted in frequency.

Another definition of the STFT is also commonly used; this definition
differs only in the phase of the STFT, while the magnitude is
unchanged:

(2)  $W(f,g)(x,\omega) = \int_{\mathbb{R}} f(t+x)\,\overline{g(t)}\,e^{-2\pi i \omega t}\,dt.$

This convention is called time-invariant, because a shift in time of
the input signal will also shift the STFT coefficients without
altering them. This variant is convenient for implementations, for
example when using the FFT [24]. The two conventions are related by a
simple phase term through the relation
$W(f,g)(x,\omega) = e^{2\pi i \omega x}\,V(f,g)(x,\omega)$.
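The relation between the two conventions can be checked numerically.
The following sketch assumes the frequency-invariant form
$V(f,g)(x,\omega) = \int f(t)\,\overline{g(t-x)}\,e^{-2\pi i\omega t}\,dt$
and the time-invariant form
$W(f,g)(x,\omega) = \int f(t+x)\,\overline{g(t)}\,e^{-2\pi i\omega t}\,dt$;
the test signal, the real Gaussian window and the helper names are
illustrative choices, not taken from the paper.

```python
import cmath
import math


def midpoint_integral(integrand, a, b, n=4000):
    """Midpoint-rule approximation of the integral of `integrand` on [a, b]."""
    dt = (b - a) / n
    return sum(integrand(a + (k + 0.5) * dt) for k in range(n)) * dt


def f(t):
    """Illustrative test signal: a Gaussian pulse modulated to 2 Hz."""
    return math.exp(-math.pi * (t - 0.3) ** 2) * cmath.exp(2j * math.pi * 2.0 * t)


def g(t):
    """Assumed real window, so complex conjugation can be omitted."""
    return math.exp(-math.pi * t * t)


def stft_v(x, omega):
    """Frequency-invariant convention."""
    return midpoint_integral(
        lambda t: f(t) * g(t - x) * cmath.exp(-2j * math.pi * omega * t), -10.0, 10.0)


def stft_w(x, omega):
    """Time-invariant convention."""
    return midpoint_integral(
        lambda t: f(t + x) * g(t) * cmath.exp(-2j * math.pi * omega * t), -10.0, 10.0)


x, omega = 0.7, 2.3
v = stft_v(x, omega)
w = stft_w(x, omega)
relation_error = abs(w - cmath.exp(2j * math.pi * omega * x) * v)
modulus_error = abs(abs(w) - abs(v))
```

Both errors should be at the level of rounding noise: the two
conventions share the same modulus and differ only by the phase factor
$e^{2\pi i\omega x}$.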

The phase of the STFT is usually not considered directly. In fact, it
is more interesting to consider the phase derivative over time or
frequency. Indeed, these quantities appear naturally in the context
of reassignment [?], [?], and manipulation of the phase derivative
over time is the idea behind the phase vocoder [?], [8] for time
stretching and compression. Their interpretation is easier, as the
derivative of the phase over time can be interpreted as a local
instantaneous frequency, while the derivative of the phase over
frequency can be interpreted as a local group delay. While there are
other methods to estimate the instantaneous frequency [?], [6], we
stick to the frequently used derivative of the phase.

To compute the local instantaneous frequency numerically, an
unwrapping of the phase, i.e. extending the function with values on
the torus $[0, 2\pi)$ to a continuous function with values in
$\mathbb{R}$, is needed to avoid discontinuities, see also
Proposition 5.1. This is the classical method used in [?], [?].
Another method was found in [?]:

(3)  $\frac{\partial}{\partial x}\arg\big(V(f,g)(x,\omega)\big) = \operatorname{Im}\left(\frac{\frac{\partial}{\partial x}V(f,g)(x,\omega)}{V(f,g)(x,\omega)}\right) = -\operatorname{Im}\left(\frac{V(f,g')(x,\omega)}{V(f,g)(x,\omega)}\right),$

with $g'(t) = \frac{d}{dt}g(t)$. A similar approach is possible for the
derivative over frequency. The benefit of this method is that it does
not require unwrapping; instead the phase derivative is computed by
pointwise operations using a second STFT based on the derivative of
the window.
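The following sketch compares the two routes to the phase derivative
over time on a small example. It assumes the frequency-invariant
convention $V(f,g)(x,\omega) = \int f(t)\,\overline{g(t-x)}\,e^{-2\pi i\omega t}\,dt$,
for which differentiating under the integral gives
$\partial_x \arg V(f,g) = -\operatorname{Im}\big(V(f,g')/V(f,g)\big)$;
the sign and the absence of further factors depend on that convention.
The window $e^{-\pi t^2}$, the test signals and all function names are
assumptions made for the illustration. The reference value is a
central difference of the phase, formed from the product
$V(x+h)\,\overline{V(x-h)}$ so that no unwrapping is needed.

```python
import cmath
import math


def window(t):
    """Assumed real Gaussian window exp(-pi t^2)."""
    return math.exp(-math.pi * t * t)


def window_derivative(t):
    """Derivative g'(t) of the window above."""
    return -2.0 * math.pi * t * math.exp(-math.pi * t * t)


def stft(f, g, x, omega, half_width=6.0, n=2400):
    """Midpoint-rule STFT for a real window g (no conjugation needed)."""
    dt = 2.0 * half_width / n
    total = 0j
    for k in range(n):
        t = x - half_width + (k + 0.5) * dt
        total += f(t) * g(t - x) * cmath.exp(-2j * math.pi * omega * t)
    return total * dt


def phase_derivative_time(f, x, omega):
    """Phase derivative over time from two STFTs, without unwrapping."""
    return -(stft(f, window_derivative, x, omega) / stft(f, window, x, omega)).imag


def phase_derivative_fd(f, x, omega, h=1e-4):
    """Central difference of the phase; the product trick avoids 2*pi jumps."""
    vp = stft(f, window, x + h, omega)
    vm = stft(f, window, x - h, omega)
    return cmath.phase(vp * vm.conjugate()) / (2.0 * h)


def tone(t):
    return cmath.exp(2j * math.pi * 3.0 * t)


def chirp(t):
    return cmath.exp(2j * math.pi * (1.0 * t + 0.4 * t * t))


tone_value = phase_derivative_time(tone, 1.5, 3.4)
chirp_value = phase_derivative_time(chirp, 0.5, 1.2)
chirp_reference = phase_derivative_fd(chirp, 0.5, 1.2)
```

For the pure tone at 3 Hz the result is $-2\pi(\omega - 3)$ in this
convention, i.e. $2\pi$ times the offset between the tone frequency and
the analysis frequency; in the time-invariant convention the extra
term $2\pi\omega$ turns this into $2\pi$ times the tone frequency itself.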

In applications often a subsampled version of the STFT is used. This
leads to a subsampled version of the phase, which introduces aliasing
effects in the phase values. This makes the interpretation of the
phase difficult. However, as (3) shows, it is possible to compute a
subsampled version of the phase gradient without ever having to
compute a full STFT. This is another reason for considering the phase
gradient instead of the phase itself.

The phase derivative over time is of particular interest for the
analysis of signals containing sinusoidal components, as often
encountered in acoustics [8]. In [1] it is shown how the local
instantaneous frequency gives access to the exact centre frequency of
a slowly changing sinusoid, despite the spread in time and frequency
normally produced by the STFT.



3. Numerical Observations 



Let us investigate the derivative of the phase of the STFT of noise.
We will see that around zeros the phase derivative shows the
interesting behavior mentioned above.

For noise, only statistical properties of the phase are accessible.
Some interesting results for the phase derivative have been shown in
the context of reassignment. In particular, a study of the
distribution of the phase derivative values appears in [5]. There,
the following result is given. We consider a zero-mean Gaussian
analytic white noise $f$ whose second-order statistics are given by

(4)  $E[\operatorname{Re}(f(t)) \cdot \operatorname{Re}(f(s))] = E[\operatorname{Im}(f(t)) \cdot \operatorname{Im}(f(s))] = \frac{\sigma^2}{2}\,\delta(t-s)$

and $E[f(t)f(s)] = 0$ for any $(t,s) \in \mathbb{R}^2$, with its real
and imaginary parts a Hilbert transform pair. Using a Gaussian window
given by

t 2 

g(t) = e , the phase derivative over time of V(f,g) is a random 
variable with distribution of the form: 

(5)  $p(v) = \frac{1}{2\,(1+v^2)^{3/2}}.$

This distribution is shown in Figure 1. As can be seen, it is quite a
"peaky" distribution, indicating that the values of the phase
derivative are mostly close to zero, with some rare values of higher
absolute value.

Figure 1. Distribution of the values of the phase derivative over
time of the STFT with Gaussian window for a white Gaussian noise
(with variance 1).

But the information about the distribution of the values of the phase
derivative does not give any clue about the spatial distribution of
these values in the time-frequency plane. Because the STFT of noise
is no longer uncorrelated [1], this would be important information.
As accessing this information theoretically seems particularly
difficult, we conducted systematic numerical experiments to study
this spatial distribution.

For this, we need to compute the derivative of the phase in a
discrete setting. We used the expression (3) to compute the phase
derivative.

Using this formula we face numerical difficulties when the
denominator $V(f,g)(x,\omega)$ is close to zero. But using double
precision, these problems appear only for really small values of the
modulus (on the order of $10^{-13}$), which allows us to reliably
observe the values of the phase derivative even close to the zeros of
the STFT. In the figures of this paper, the phase derivative values
are ignored and represented as white at the points where the value of
the modulus is too small.
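The masking step described above can be sketched as follows. This is
an illustration only, not the LTFAT code used for the figures: the
function name `masked_phase_derivative` is hypothetical, the default
threshold simply reuses the order of magnitude $10^{-13}$ quoted in the
text, and the sign convention $-\operatorname{Im}(V(f,g')/V(f,g))$ is
the one that follows from the frequency-invariant definition of the
STFT.

```python
import math


def masked_phase_derivative(v_values, v_dwin_values, floor=1e-13):
    """Phase derivative over time from two STFTs, with small moduli masked.

    v_values      -- STFT coefficients V(f, g) on some grid (flat list)
    v_dwin_values -- coefficients V(f, g') computed with the window derivative
    floor         -- modulus below which the quotient is considered unreliable

    Masked entries are returned as NaN, which plotting tools typically
    leave blank (shown as white in the figures).
    """
    result = []
    for v, v_dwin in zip(v_values, v_dwin_values):
        if abs(v) < floor:
            result.append(math.nan)
        else:
            result.append(-(v_dwin / v).imag)
    return result


example = masked_phase_derivative([1 + 0j, 1e-15 + 0j], [2j, 1j])
```

The first entry of `example` is an ordinary value, the second is
masked because the modulus of the denominator lies below the
threshold.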

The results of our experiments are illustrated by Figure 2. As can
be seen in this figure, the time-frequency distribution of the values
is highly structured, as in [12]. In particular, the values of the
phase derivative with high absolute value are concentrated around
several time-frequency points, which can be identified as the zeros
of the transform when looking at the modulus. Furthermore, the shape
of the phase derivative seems to be very similar in the neighbourhood
of each zero, with a typical pattern repeating at each zero. This
typical pattern is shown in the third image of Figure 2. When going
from low to high frequencies, it presents a negative peak followed by
a positive one.

This phenomenon is related to the fact that the STFT of white noise
is a correlated process. This correlation is determined by the
reproducing kernel of the transform (see part 6.2.1 of [1]). It is
thus interesting to study the influence of the window choice on the
observed structure of the phase derivative. It could be expected that
this behavior depends strongly on the Gaussian window and on the fact
that, in this case, the STFT is related to the Bargmann transform [?],
which is an analytic function.

The influence of the window is illustrated in Figure 3. Comparing
Figure 2 and the first display of Figure 3, we can observe the effect
of scaling the window (i.e. changing the window length). We see that
narrowing the window results in similar patterns around the zeros,
but with a scaled shape: the resulting pattern is narrower over time,
but wider over frequency.

Figure 2. Observation for a Gaussian white noise, using a Gaussian
window. Top: modulus of the STFT. Bottom-left: derivative over time
of the phase of the STFT using the definition (3). Bottom-right: mesh
plot of the derivative over time of the phase in the neighbourhood of
a zero of the STFT.



In Figure 3 we also see that for windows with a worse time-frequency
concentration than the Gaussian window, the structure is more
complicated. In the representation using a Hamming window, we still
observe repeating patterns at the zeros of the transform, but the
variability of the shape of this pattern seems higher, and the
pattern orientation varies slightly, whereas it is fixed in the case
of a Gaussian window.

For the rectangular window, the zeros of the STFT do not form
discrete points as for the two other types of window but, rather,
more complicated, extended structures. This leads to much more
variable patterns. Yet, interestingly, we still observe that the
values of the phase derivative with high absolute value concentrate
around the zeros of the transform, whereas the phase derivative is
close to zero in the regions of the STFT where the modulus is high.
The seemingly different pattern in the case of a rectangular window
appears to be created by artifacts due to the discontinuity of the
window. But in principle the behavior of the phase derivative around
zeros is the same as in the other cases.

Figure 3. Influence of the window when analyzing a Gaussian white
noise. For three different windows: on the left, modulus of the STFT;
on the right, derivative over time of the phase of the STFT using the
definition (3). From top to bottom, the windows are: a narrower
Gaussian window, a Hamming window, a rectangular window.



In summary, the effect can be seen not only for the Gaussian, but
also for the Hamming window. Therefore we can expect that some
regularity of the window is needed to observe this effect, but not
the full regularity of the Gaussian window. This is reflected in the
mathematical results in Section 5. Numerically it can be observed
that even for discontinuous windows like the rectangular window a
similar structure can be seen, but perturbed by the missing
regularity of the window.

The behavior that we observe for noise is not specific to this kind
of signal. Indeed, further experiments on other synthesized and
recorded complex sounds showed that the same characteristics can be
observed for all of these signals: the values of the phase derivative
of high absolute value are concentrated in the neighbourhood of the
zeros of the STFT, and for "nice" windows, a specific pattern appears
in this neighbourhood.



4. A Simple Explicit Analytic Example 

In this section we give a simple analytical example for which we can
compute the phase derivative and show the phenomenon around zeros
explicitly.

Considering the signal given by

(6)  $f(t) = e^{2\pi i \omega_1 t} + e^{2\pi i \omega_2 t}$

and using a Gaussian window $g$ with width parameter $\sigma$, we can
explicitly compute the expression of the STFT, which results in the
formula:

(7)  $W(f,g)(x,\omega) = e^{2\pi i \omega_1 x}\,e^{-2\pi\sigma^2(\omega-\omega_1)^2} + e^{2\pi i \omega_2 x}\,e^{-2\pi\sigma^2(\omega-\omega_2)^2}.$

The zeros of this STFT are the points with coordinates
$(x_k, \omega_m)$ in the time-frequency plane, with
$\omega_m = \frac{\omega_1+\omega_2}{2}$ and
$x_k = \frac{1+2k}{2(\omega_2-\omega_1)}$ for $k \in \mathbb{Z}$.

The expression of the phase derivative for this signal, given in part
VI-12 of [?], is:

(8)  $\frac{\partial}{\partial x}\arg\big(W(f,g)(x,\omega)\big) = 2\pi\omega_m + 2\pi\delta\,\tanh(s)\,\frac{1+\tan^2(2\pi\delta x)}{1+\tanh^2(s)\,\tan^2(2\pi\delta x)},$

with $\delta = \frac{\omega_2-\omega_1}{2}$ and
$s = 4\pi\sigma^2(\omega-\omega_m)\,\delta$.

The plot of this function around one of the zeros is shown in
Figure 4. We see the pattern that was already observed in the
previous section.
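The two-tone example can be explored numerically. The sketch below
assumes that the STFT of the example has the form
$W(x,\omega) = e^{2\pi i\omega_1 x}e^{-2\pi\sigma^2(\omega-\omega_1)^2} + e^{2\pi i\omega_2 x}e^{-2\pi\sigma^2(\omega-\omega_2)^2}$
and that the phase derivative over time is
$2\pi\omega_m + 2\pi\delta\tanh(s)\,(1+\tan^2\theta)/(1+\tanh^2(s)\tan^2\theta)$
with $\theta = 2\pi\delta x$, $\delta = (\omega_2-\omega_1)/2$ and
$s = 4\pi\sigma^2(\omega-\omega_m)\delta$; the small frequencies
$\omega_1 = 1$, $\omega_2 = 3$, the value $\sigma = 0.5$ and the function
names are illustrative assumptions. The closed form is evaluated in
the algebraically equivalent shape $1/(\cos^2\theta + \tanh^2(s)\sin^2\theta)$
to avoid overflow of the tangent at the zeros.

```python
import cmath
import math

W1, W2, SIGMA = 1.0, 3.0, 0.5  # illustrative parameters


def w_two_tones(x, omega):
    """STFT of the two-tone signal in the assumed closed form."""
    c = 2.0 * math.pi * SIGMA ** 2
    return (cmath.exp(2j * math.pi * W1 * x) * math.exp(-c * (omega - W1) ** 2)
            + cmath.exp(2j * math.pi * W2 * x) * math.exp(-c * (omega - W2) ** 2))


def phase_derivative_closed_form(x, omega):
    """Phase derivative over time of w_two_tones, closed form."""
    w_m = 0.5 * (W1 + W2)
    delta = 0.5 * (W2 - W1)
    s = 4.0 * math.pi * SIGMA ** 2 * (omega - w_m) * delta
    th = math.tanh(s)
    theta = 2.0 * math.pi * delta * x
    shape = 1.0 / (math.cos(theta) ** 2 + th * th * math.sin(theta) ** 2)
    return 2.0 * math.pi * w_m + 2.0 * math.pi * delta * th * shape


def phase_derivative_fd(x, omega, h=1e-6):
    """Central difference of the phase of w_two_tones, without unwrapping."""
    wp = w_two_tones(x + h, omega)
    wm = w_two_tones(x - h, omega)
    return cmath.phase(wp * wm.conjugate()) / (2.0 * h)


x_zero = 1.0 / (2.0 * (W2 - W1))   # x_0 = 0.25
w_zero = 0.5 * (W1 + W2)           # omega_m = 2
modulus_at_zero = abs(w_two_tones(x_zero, w_zero))
below = phase_derivative_closed_form(x_zero, w_zero - 0.01)
above = phase_derivative_closed_form(x_zero, w_zero + 0.01)
closed = phase_derivative_closed_form(0.2, 2.1)
reference = phase_derivative_fd(0.2, 2.1)
```

Approaching the zero along the frequency axis, `below` is strongly
negative and `above` strongly positive, which is the pair of peaks
seen in the figures; away from the zero the closed form agrees with
the finite-difference reference.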



5. Mathematical Treatment 



Here we show the analytical background of the observed behavior. We
decided to include detailed versions of the proofs and results to be
as self-contained as possible.

Figure 4. Observation for the signal defined in (6) with
$\omega_1 = 500$ and $\omega_2 = 1500$ Hz. Top: modulus of the STFT.
Bottom: derivative over time of the phase of the STFT according to
(8), represented as an image (left) and as a mesh (right).

5.1. The Derivative of Amplitude and Phase. We start our rigorous
analytic treatment by outlining the mathematical foundations of the
notions of phase and phase derivative of a complex-valued function.

In this section, to shorten notation, let
$f : U \subseteq \mathbb{C} \to \mathbb{C}$, where $U$ is an open
subset of $\mathbb{C}$. Denote by $N = \{z \in U \mid f(z) = 0\}$ the
set of zeros of $f$. Furthermore, for a function $q$ we use the
notation $q_x := \frac{\partial q}{\partial x}$.

The first proposition, whose straightforward proof we omit, mainly
serves the purpose of fixing notation and terminology.

Proposition 5.1. Let $f \in C^1(U, \mathbb{R}^2)$. Let
$z \in U \setminus N$. Then there exists a connected open
neighbourhood $V \subseteq U \setminus N$ of $z$ and a differentiable
function $\phi = \phi(x,y)$, $\phi : V \to \mathbb{R}$ (considered as
a function of the two real variables $x$ and $y$) such that
$f(z) = |f(z)| \cdot e^{i\phi(z)} = r(x,y) \cdot e^{i\phi(x,y)}$
for all $z = x + iy \in V$.

Any such function $\phi$ is called a (differentiable) phase
(function). In signal processing this is often called the unwrapped
phase.

Observe that two such functions differ by a constant multiple of
$2\pi$, i.e. if $\psi : V \to \mathbb{R}$ is another differentiable
function such that $f(z) = |f(z)| \cdot e^{i\psi(z)}$ for all
$z \in V$, then there is a $k \in \mathbb{Z}$ (independent of $z$)
with $\phi(z) = \psi(z) + k \cdot 2\pi$ for all $z \in V$.

For convenience, we introduce the following notation: the negative
imaginary axis $\{z = x + iy \in \mathbb{C} : x = 0,\ y \le 0\}$ will
be denoted by $Y^-$, the positive imaginary axis
$\{z = x + iy \in \mathbb{C} : x = 0,\ y \ge 0\}$ by $Y^+$.

One choice for the argument of a complex number, $\psi = \arg(z)$,
is given explicitly in the following proposition. Note that this
gives differentiable phase functions for the identity. In the
following result the derivative of $\psi$ is of particular interest.

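Proposition 5.2. Let $z = x + iy \in \mathbb{C}$ be a complex number,
given in Cartesian coordinates. Then $z = re^{i\psi}$ with $r = |z|$,
and

$\psi = \psi(x,y) = \begin{cases} \arctan\frac{y}{x}, & \text{if } x > 0 \\ \arctan\frac{y}{x} + \pi, & \text{if } x < 0 \\ \frac{\pi}{2}, & \text{if } x = 0 \text{ and } y > 0 \\ -\frac{\pi}{2}, & \text{if } x = 0 \text{ and } y < 0. \end{cases}$

We have $\psi \in C^\infty(\mathbb{R}^2 \setminus Y^-, \mathbb{R})$.
The partial derivatives of first order are given by

$\partial_1\psi(x,y) = \frac{-y}{x^2+y^2}$

and

$\partial_2\psi(x,y) = \frac{x}{x^2+y^2}$

for $(x,y) \notin Y^-$.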

Proof. Considering the geometric interpretation of the phase of a
complex number $z$ as the angle between the radius vector from the
origin to $z$ and the positive real axis, the formula is quite clear.
For $(x,y) \notin Y^+ \cup Y^-$, $\psi$ is obviously differentiable
with partial derivatives

$\partial_1\psi(x,y) = \frac{\partial}{\partial x}\left(\arctan\frac{y}{x}\ (+\pi)\right) = \frac{-y}{x^2+y^2}$

and

$\partial_2\psi(x,y) = \frac{\partial}{\partial y}\left(\arctan\frac{y}{x}\ (+\pi)\right) = \frac{x}{x^2+y^2}.$

Now assume $(x,y) \in Y^+ \setminus \{(0,0)\}$. Then

$\frac{\psi(x+h,y) - \psi(x,y)}{h} = \begin{cases} \dfrac{\arctan\frac{y}{h} - \frac{\pi}{2}}{h}, & \text{if } h > 0 \\ \dfrac{\arctan\frac{y}{h} + \pi - \frac{\pi}{2}}{h}, & \text{if } h < 0. \end{cases}$

Let $h \downarrow 0$. Then, for the first expression, numerator and
denominator are both differentiable with respect to $h$ and go to
zero, thus L'Hospital's Rule applies and yields

$\lim_{h \downarrow 0} \frac{\arctan\frac{y}{h} - \frac{\pi}{2}}{h} \overset{\text{L'Hosp.}}{=} \lim_{h \downarrow 0} \frac{-y}{h^2+y^2} = -\frac{1}{y}.$

The case $h \uparrow 0$ yields the same result. Thus the above
formula for $\partial_1\psi$ also holds for
$(x,y) \in Y^+ \setminus \{(0,0)\}$, i.e. $x = 0$ and $y > 0$.

The other partial derivative can be treated analogously for
$(x,y) \in Y^+ \setminus \{(0,0)\}$.

Since the partial derivatives of first order are obviously infinitely
differentiable functions (on $\mathbb{R}^2 \setminus \{(0,0)\}$), we
have $\psi \in C^\infty(\mathbb{R}^2 \setminus Y^-, \mathbb{R})$. □

The proposition allows us to give a first explicit formula for the
partial derivatives of the phase of a complex function.

Proposition 5.3 (Phase derivatives of a complex function). Let
$f \in C^1(U, \mathbb{R}^2)$ be continuously differentiable with
$f(z) = u(z) + i \cdot v(z)$. Then the phase derivatives are given by

$\frac{\partial\phi}{\partial x}(x,y) = \frac{u(x,y) \cdot v_x(x,y) - v(x,y) \cdot u_x(x,y)}{u^2(x,y) + v^2(x,y)},$

$\frac{\partial\phi}{\partial y}(x,y) = \frac{u(x,y) \cdot v_y(x,y) - v(x,y) \cdot u_y(x,y)}{u^2(x,y) + v^2(x,y)}$

for all $z = x + iy \notin N$ (independent of the actual choice of
differentiable phase function).



Proof. Let $z \in U \setminus N$ (i.e. $f(z) \neq 0$). Since any two
phase functions in a neighbourhood of $z$ differ only by a constant,
it is clear that the partial derivatives are independent of the
actual choice of $\phi$.
Denote by $\psi$ the function in the preceding proposition.
Assume $z = x + iy \in \mathbb{C}$ such that $f(z) \notin Y^-$. Then

$\phi(z) = \phi(x,y) = (\psi \circ f)(x,y)$

is a differentiable phase function in a neighbourhood of $z$.
If $f(z) \in Y^-$, replace the function $\psi$ by

$\tilde\psi = \tilde\psi(x,y) = \begin{cases} \arctan\frac{y}{x}, & \text{if } x > 0 \\ \arctan\frac{y}{x} - \pi, & \text{if } x < 0 \\ \frac{\pi}{2}, & \text{if } x = 0 \text{ and } y > 0 \\ -\frac{\pi}{2}, & \text{if } x = 0 \text{ and } y < 0. \end{cases}$

It is not hard to see that this function is in
$C^\infty(\mathbb{R}^2 \setminus Y^+, \mathbb{R})$ with the same
first partial derivatives
$\partial_1\tilde\psi(x,y) = \frac{-y}{x^2+y^2}$ and
$\partial_2\tilde\psi(x,y) = \frac{x}{x^2+y^2}$
for $(x,y) \notin Y^+$. Then set

$\phi(z) = \phi(x,y) = (\tilde\psi \circ f)(x,y).$

Again this gives a differentiable phase function in a neighbourhood
of $z$.

In both cases, the chain rule yields the stated formulas:

$\frac{\partial\phi}{\partial x}(x,y) = \frac{-v(x,y)}{u^2(x,y)+v^2(x,y)} \cdot \partial_x u(x,y) + \frac{u(x,y)}{u^2(x,y)+v^2(x,y)} \cdot \partial_x v(x,y) = \frac{u(x,y) \cdot \partial_x v(x,y) - v(x,y) \cdot \partial_x u(x,y)}{u^2(x,y) + v^2(x,y)}$

and analogously for $\frac{\partial\phi}{\partial y}$.

□
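As a quick sanity check of the phase derivative formulas
$(u v_x - v u_x)/(u^2+v^2)$ and $(u v_y - v u_y)/(u^2+v^2)$, the
following sketch evaluates them for the illustrative function
$f(z) = z^2 + 1$, i.e. $u = x^2 - y^2 + 1$ and $v = 2xy$, and compares
them with central differences of the phase. The choice of $f$, the
evaluation point and all names are assumptions made only for this
example.

```python
import cmath


def u(x, y):
    return x * x - y * y + 1.0


def v(x, y):
    return 2.0 * x * y


def phase_gradient(x, y):
    """Phase derivatives via (u v_x - v u_x)/(u^2+v^2) and its y-analogue."""
    u_x, u_y = 2.0 * x, -2.0 * y   # partial derivatives of u
    v_x, v_y = 2.0 * y, 2.0 * x    # partial derivatives of v
    denom = u(x, y) ** 2 + v(x, y) ** 2
    return ((u(x, y) * v_x - v(x, y) * u_x) / denom,
            (u(x, y) * v_y - v(x, y) * u_y) / denom)


def phase_gradient_fd(x, y, h=1e-6):
    """Central differences of the phase; products avoid 2*pi jumps."""
    def f(a, b):
        return complex(u(a, b), v(a, b))
    d_x = cmath.phase(f(x + h, y) * f(x - h, y).conjugate()) / (2.0 * h)
    d_y = cmath.phase(f(x, y + h) * f(x, y - h).conjugate()) / (2.0 * h)
    return d_x, d_y


exact = phase_gradient(0.7, -0.4)
approx = phase_gradient_fd(0.7, -0.4)
```

The formula only requires the values of $u$, $v$ and their first
partial derivatives at the point in question, so no choice of branch
for the phase is involved.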



5.2. Regularity Properties of the STFT. In this section, we give a
regularity and smoothness result for the STFT with Schwartz window.
As a rule of thumb, the smoother the window function is, the smoother
the STFT. This principle is well reflected by the result of this
section.

We prove a smoothness result for arbitrary windows in the Schwartz
class [21] of infinitely differentiable, rapidly decaying functions

$\mathcal{S}(\mathbb{R}) = \Big\{\phi \in C^\infty(\mathbb{R},\mathbb{C}) : \sup_{t \in \mathbb{R}} \Big|t^\alpha \frac{d^\beta}{dt^\beta}\phi(t)\Big| < \infty \text{ for all } \alpha, \beta \in \mathbb{N}_0\Big\}.$

Although the result may be considered mathematical folklore, to our
knowledge it has not been stated and proved in the literature so far.
To formulate Theorem 5.6 we give a few preliminary definitions:

Definition 5.4 (Differentiation and multiplication operator). Let
$g \in \mathcal{S}(\mathbb{R})$ be a Schwartz function. Denote by

$D : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R}),\ g(t) \mapsto Dg(t) := g'(t)$ and
$M : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R}),\ g(t) \mapsto Mg(t) := t \cdot g(t)$

the differentiation resp. multiplication operator on
$\mathcal{S}(\mathbb{R})$.

It can easily be shown that the differentiation and multiplication
operators satisfy the commutation relation

(9)  $DM - MD = \mathrm{Id}.$

Definition 5.5 (Translation and difference operator). Let
$g \in \mathcal{S}(\mathbb{R})$ and $h \in \mathbb{R}$. The operator

$T_h : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R}),\ g(t) \mapsto T_h g(t) := g(t-h)$

is called translation operator (with shift parameter $h$). If
$h \neq 0$, then we also define

$D_h : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R}),\ g(t) \mapsto D_h g(t) := \frac{g(t+h) - g(t)}{h},$

the difference operator. Thus

$D_h = \frac{T_{-h} - \mathrm{Id}}{h}.$

Again, it is easy to show that translation and difference quotient
satisfy the following commutation relations for all
$h \in \mathbb{R}$ resp. $h \in \mathbb{R} \setminus \{0\}$:

(10)  $D T_h = T_h D, \qquad M T_h = T_h M + h T_h$

and

(11)  $D D_h = D_h D, \qquad M D_h = D_h M - T_{-h}.$
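The relations that do not involve the derivative can be verified
pointwise with a few lines of code. In this sketch the operators are
modelled as higher-order functions acting on ordinary Python
callables; the names `M`, `T`, `D_h` mirror the notation of the
definitions, with $T_h g(t) = g(t-h)$, $D_h g(t) = (g(t+h)-g(t))/h$ and
$Mg(t) = t\,g(t)$, while the test function and the sample points are
arbitrary choices for illustration.

```python
import math


def M(g):
    """Multiplication operator: (Mg)(t) = t * g(t)."""
    return lambda t: t * g(t)


def T(h, g):
    """Translation operator: (T_h g)(t) = g(t - h)."""
    return lambda t: g(t - h)


def D_h(h, g):
    """Difference operator: (D_h g)(t) = (g(t + h) - g(t)) / h."""
    return lambda t: (g(t + h) - g(t)) / h


def g(t):
    """Arbitrary rapidly decaying test function."""
    return math.exp(-t * t) * (1.0 + t)


def max_residual(lhs, rhs, points):
    return max(abs(lhs(t) - rhs(t)) for t in points)


h = 0.3
points = [-2.0 + 0.25 * k for k in range(17)]

# M T_h = T_h M + h T_h
res_translation = max_residual(
    M(T(h, g)), lambda t: T(h, M(g))(t) + h * T(h, g)(t), points)
# M D_h = D_h M - T_{-h}
res_difference = max_residual(
    M(D_h(h, g)), lambda t: D_h(h, M(g))(t) - T(-h, g)(t), points)
# D_h = (T_{-h} - Id) / h
res_definition = max_residual(
    D_h(h, g), lambda t: (T(-h, g)(t) - g(t)) / h, points)
```

All three residuals vanish up to floating-point rounding, since the
identities are purely algebraic and hold for any function, not only
for Schwartz functions.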

Theorem 5.6 (Smoothness of the STFT with Schwartz window). Let
$f \in L^2(\mathbb{R})$ and $g \in \mathcal{S}(\mathbb{R})$. Then the
short-time Fourier transform of $f$ with window $g$ is infinitely
differentiable, i.e. $V(f,g) \in C^\infty(\mathbb{R}^2, \mathbb{R}^2)$.
The partial derivatives are given by

$V_x(f,g)(x,\omega) = V(f,-Dg)(x,\omega)$

and

$V_\omega(f,g)(x,\omega) = -2\pi i\,\big(x \cdot V(f,g)(x,\omega) + V(f,Mg)(x,\omega)\big).$
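The two derivative formulas of the theorem, $V_x(f,g) = V(f,-Dg)$ and
$V_\omega(f,g) = -2\pi i\,(x\,V(f,g) + V(f,Mg))$, can be checked against
central differences. The sketch assumes the frequency-invariant STFT
with a real window, approximated by a midpoint sum; the signal, the
window $e^{-\pi t^2}$, the evaluation point and the function names are
illustrative assumptions.

```python
import cmath
import math


def f(t):
    """Illustrative square-integrable test signal."""
    return math.exp(-0.5 * t * t) * cmath.exp(2j * math.pi * 1.5 * t)


def g(t):
    return math.exp(-math.pi * t * t)


def minus_dg(t):
    """-Dg for the window above."""
    return 2.0 * math.pi * t * math.exp(-math.pi * t * t)


def mg(t):
    """Mg for the window above."""
    return t * math.exp(-math.pi * t * t)


def stft(sig, win, x, omega, half_width=6.0, n=2400):
    """Midpoint-rule STFT for a real window (no conjugation needed)."""
    dt = 2.0 * half_width / n
    total = 0j
    for k in range(n):
        t = x - half_width + (k + 0.5) * dt
        total += sig(t) * win(t - x) * cmath.exp(-2j * math.pi * omega * t)
    return total * dt


x0, w0, h = 0.4, 1.2, 1e-4
fd_x = (stft(f, g, x0 + h, w0) - stft(f, g, x0 - h, w0)) / (2.0 * h)
fd_w = (stft(f, g, x0, w0 + h) - stft(f, g, x0, w0 - h)) / (2.0 * h)
err_x = abs(fd_x - stft(f, minus_dg, x0, w0))
err_w = abs(fd_w - (-2j * math.pi * (x0 * stft(f, g, x0, w0) + stft(f, mg, x0, w0))))
```

Both derivatives are again STFTs of the same signal with modified
Schwartz windows, which is what makes the induction in the proof
possible.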



To prove Theorem 5.6 we give a preliminary lemma.

Lemma 5.7 (Difference quotient in $\mathcal{S}(\mathbb{R})$). Let
$g \in \mathcal{S}(\mathbb{R})$. Then

$\lim_{h \to 0} D_h g = Dg$

converges in the topology of $\mathcal{S}(\mathbb{R})$.



Proof. Let $\alpha, \beta \in \mathbb{N}_0$. We have to show that

$\|M^\alpha D^\beta (D_h g - Dg)\|_\infty \to 0$

for $h \to 0$, for all $g \in \mathcal{S}(\mathbb{R})$.

We start by proving the following explicit formula:

$M^\alpha D^\beta (D_h - D) = (D_h - D) M^\alpha D^\beta - \alpha\,(T_{-h} - \mathrm{Id})\,M^{\alpha-1} D^\beta - T_{-h}\Big(\sum_{\gamma=0}^{\alpha-2} \binom{\alpha}{\gamma} (-h)^{(\alpha-1)-\gamma} M^\gamma\Big) D^\beta$

(note that for $\alpha = 0$ and $\alpha = 1$ the right hand side is
understood to reduce to $(D_h - D)D^\beta$ and
$(D_h - D)MD^\beta - (T_{-h} - \mathrm{Id})D^\beta$ respectively,
since in these cases the sum collapses).

In order to see this, observe first that
$D^\beta (D_h - D) = (D_h - D) D^\beta$ for all
$\beta \in \mathbb{N}_0$, since $D$ and $D_h$ commute by (11). This
settles the case $\alpha = 0$.

Next, assume $\alpha = 1$. Then

$M(D_h - D) = MD_h - MD = (D_h - D)M - (T_{-h} - \mathrm{Id})$

by (9) and (11). Thus

$MD^\beta (D_h - D) = M(D_h - D)D^\beta = (D_h - D)MD^\beta - (T_{-h} - \mathrm{Id})D^\beta.$

This is the case $\alpha = 1$.
Finally, assume $\alpha \ge 2$. Then

$M^\alpha D = DM^\alpha - \alpha M^{\alpha-1}$

using (9) and a simple mathematical induction, hence

(*)  $M^\alpha (D_h - D) = M^\alpha D_h - M^\alpha D = M^\alpha D_h - DM^\alpha + \alpha M^{\alpha-1}.$

For the first term on the right hand side, we compute

$M^\alpha D_h = M^\alpha\,\frac{T_{-h} - \mathrm{Id}}{h} = \frac{1}{h}\,(M^\alpha T_{-h} - M^\alpha).$

Furthermore, for $g \in \mathcal{S}(\mathbb{R})$ we have, using the
binomial theorem,

$M^\alpha T_{-h}\,g(t) = (t + h - h)^\alpha\,g(t+h) = \sum_{\gamma=0}^{\alpha} \binom{\alpha}{\gamma} (-h)^{\alpha-\gamma}\,T_{-h} M^\gamma g(t),$

thus

$M^\alpha T_{-h} = T_{-h}\Big(M^\alpha - \alpha h M^{\alpha-1} + \sum_{\gamma=0}^{\alpha-2} \binom{\alpha}{\gamma} (-h)^{\alpha-\gamma} M^\gamma\Big).$

So

$M^\alpha D_h = \frac{1}{h}\Big[T_{-h}\Big(M^\alpha - \alpha h M^{\alpha-1} + \sum_{\gamma=0}^{\alpha-2} \binom{\alpha}{\gamma} (-h)^{\alpha-\gamma} M^\gamma\Big) - M^\alpha\Big] = D_h M^\alpha - \alpha T_{-h} M^{\alpha-1} - T_{-h}\Big(\sum_{\gamma=0}^{\alpha-2} \binom{\alpha}{\gamma} (-h)^{(\alpha-1)-\gamma} M^\gamma\Big).$

Inserting this into (*) yields the stated formula.

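Now let $f \in \mathcal{S}(\mathbb{R})$ be an arbitrary Schwartz
function. By the Mean Value Theorem we have

$\|(D_h - D)f\|_\infty \le \|D^2 f\|_\infty \cdot |h|.$

Likewise

$\|(T_{-h} - \mathrm{Id})f\|_\infty \le \|Df\|_\infty \cdot |h|.$

Finally observe that

$\|T_{-h}f\|_\infty = \|f\|_\infty.$

Putting all this together and using the explicit formula from above,
we find at last for arbitrary $g \in \mathcal{S}(\mathbb{R})$ and
$\alpha, \beta \in \mathbb{N}_0$

$\|M^\alpha D^\beta (D_h g - Dg)\|_\infty$

$\le \|(D_h - D)M^\alpha D^\beta g\|_\infty + |\alpha| \cdot \|(T_{-h} - \mathrm{Id})M^{\alpha-1} D^\beta g\|_\infty + \sum_{\gamma=0}^{\alpha-2} \binom{\alpha}{\gamma} |h|^{(\alpha-1)-\gamma}\,\|M^\gamma D^\beta g\|_\infty$

$\le \|D^2(M^\alpha D^\beta g)\|_\infty \cdot |h| + |\alpha| \cdot \|D(M^{\alpha-1} D^\beta g)\|_\infty \cdot |h| + \Big(\sum_{\gamma=0}^{\alpha-2} \binom{\alpha}{\gamma} |h|^{(\alpha-2)-\gamma}\,\|M^\gamma D^\beta g\|_\infty\Big) \cdot |h|$

$\to 0$

for $h \to 0$, as $g \in \mathcal{S}(\mathbb{R})$. This finishes the
proof of the lemma. □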
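The linear decay in $|h|$ obtained in the estimate can be observed
numerically. The following sketch estimates the seminorm
$\sup_t |t^\alpha (D_h g - Dg)(t)|$ (the case $\beta = 0$) on a finite
grid for the Gaussian $g(t) = e^{-\pi t^2}$; the choice of $g$, of
$\alpha = 2$, of the grid and of the step sizes are assumptions made
for the illustration, and a grid maximum is of course only an
approximation of the supremum.

```python
import math


def g(t):
    return math.exp(-math.pi * t * t)


def dg(t):
    """Exact derivative of g."""
    return -2.0 * math.pi * t * math.exp(-math.pi * t * t)


def seminorm_of_error(h, alpha, t_max=6.0, step=1e-3):
    """Grid estimate of sup_t |t^alpha (D_h g - Dg)(t)|."""
    n = int(round(2.0 * t_max / step))
    worst = 0.0
    for k in range(n + 1):
        t = -t_max + k * step
        difference_quotient = (g(t + h) - g(t)) / h
        worst = max(worst, abs(t ** alpha * (difference_quotient - dg(t))))
    return worst


steps = (0.02, 0.01, 0.005)
errors = [seminorm_of_error(h, alpha=2) for h in steps]
ratios = [errors[i + 1] / errors[i] for i in range(len(errors) - 1)]
```

Halving $h$ roughly halves the estimated seminorm, in line with the
bound proportional to $|h|$.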

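We can now prove the result regarding the smoothness of the STFT
with a Schwartz window:

Proof of Theorem 5.6. Let $h \neq 0$. Then, for fixed
$(x,\omega) \in \mathbb{R}^2$,

$\frac{1}{h}\big(V(f,g)(x+h,\omega) - V(f,g)(x,\omega)\big) = \frac{1}{h}\big(\langle f, M_\omega T_{x+h}\,g\rangle - \langle f, M_\omega T_x\,g\rangle\big) = \langle f, M_\omega T_x(-D_{-h}\,g)\rangle,$

where $M_\omega$ denotes modulation by $e^{2\pi i \omega t}$ and we
used $T_{x+h} = T_x T_h$ and
$\frac{1}{h}(T_h - \mathrm{Id}) = -D_{-h}$.
By Lemma 5.7, $-D_{-h}\,g \to -Dg$ for $h \to 0$ with convergence in
the topology of $\mathcal{S}(\mathbb{R})$. Since
$\mathcal{S}(\mathbb{R}) \hookrightarrow L^2(\mathbb{R})$ is
continuously embedded, a fortiori $-D_{-h}\,g \to -Dg$ in
$L^2(\mathbb{R})$. Thus the partial derivative with respect to the
first variable

$V_x(f,g)(x,\omega) = \lim_{h \to 0} \langle f, M_\omega T_x(-D_{-h}\,g)\rangle = V(f,-Dg)(x,\omega)$

exists and is continuous.

Now note that by Lemma 3.1.1 in [13]

$V(f,g)(x,\omega) = e^{-2\pi i x\omega}\,V(\hat f, \hat g)(\omega,-x).$

Therefore

$V_\omega(f,g)(x,\omega) = \frac{\partial}{\partial\omega}\big(e^{-2\pi i x\omega}\,V(\hat f,\hat g)(\omega,-x)\big) = -2\pi i x\,e^{-2\pi i x\omega}\,V(\hat f,\hat g)(\omega,-x) + e^{-2\pi i x\omega}\,\partial_1 V(\hat f,\hat g)(\omega,-x).$

Using the above formula for $\partial_1 V$ and the fact that
$-D\hat g = 2\pi i\,\widehat{Mg}$ yields that

$V_\omega(f,g)(x,\omega) = -2\pi i\,\big(x\,V(f,g)(x,\omega) + V(f,Mg)(x,\omega)\big)$

and therefore exists and is continuous. Hence
$V(f,g) \in C^1(\mathbb{R}^2,\mathbb{R}^2)$.
But then $V_x(f,g) = V(f,-Dg)$ and
$V_\omega(f,g) = -2\pi i\,(x \cdot V(f,g) + V(f,Mg))$ are again
continuously differentiable, since
$Dg, Mg \in \mathcal{S}(\mathbb{R})$ as well. By mathematical
induction, $V(f,g) \in C^\infty(\mathbb{R}^2,\mathbb{R}^2)$. □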

5.3. The Derivative of the Phase Around the Zeros of the STFT. In
this section we present an analytic explanation of the peculiar
behaviour of the phase derivatives of the STFT for a large class of
window functions. It turns out that the phenomenon is connected to
the smoothness and continuous differentiability of the STFT which,
as we have seen in the previous section, is in turn derived from the
smoothness of the window function.

We prove results for STFTs of a certain smoothness, more precisely
of certain differentiability orders. STFTs with windows in the
Schwartz class are infinitely differentiable and thus contained as a
special case.

Consider first the partial derivative of the phase of the STFT with
respect to the first variable (i.e. 'time'). For convergence along a
vertical path, we have

Theorem 5.8 (Phase derivatives of the STFT, part I). Let
$f, g \in L^2(\mathbb{R})$. Assume that

• $V(f,g) = V = U + i \cdot W \in C^2(\mathbb{R}^2, \mathbb{R}^2)$

• $V(x_0,\omega_0) = 0$

• $\det J_V(x_0,\omega_0) \neq 0$, where

$J_V(x_0,\omega_0) = \begin{pmatrix} U_x(x_0,\omega_0) & U_\omega(x_0,\omega_0) \\ W_x(x_0,\omega_0) & W_\omega(x_0,\omega_0) \end{pmatrix}$

denotes the Jacobian matrix of $V$ at the point $(x_0,\omega_0)$.

Denote $\psi(x,\omega) = \arg(V(f,g)(x,\omega))$. Then the phase
derivative of the STFT satisfies

$\lim_{\omega \to \omega_0} \frac{\partial\psi}{\partial x}(x_0,\omega) = \begin{cases} +\infty, & \text{if } \omega \uparrow \omega_0 \text{ from below} \\ -\infty, & \text{if } \omega \downarrow \omega_0 \text{ from above} \end{cases}$

for $\det J_V(x_0,\omega_0) > 0$, respectively

$\lim_{\omega \to \omega_0} \frac{\partial\psi}{\partial x}(x_0,\omega) = \begin{cases} -\infty, & \text{if } \omega \uparrow \omega_0 \text{ from below} \\ +\infty, & \text{if } \omega \downarrow \omega_0 \text{ from above} \end{cases}$

for $\det J_V(x_0,\omega_0) < 0$.



Proof. Assume without loss of generality that
$\det J_V(x_0,\omega_0) < 0$.
By Proposition 5.3

$\frac{\partial\psi}{\partial x}(x_0,\omega) = \frac{U(x_0,\omega) \cdot W_x(x_0,\omega) - W(x_0,\omega) \cdot U_x(x_0,\omega)}{U^2(x_0,\omega) + W^2(x_0,\omega)}.$

Since $W_x$ and $U_x$ are continuous and thus remain bounded in a
neighbourhood of $(x_0,\omega_0)$, both numerator and denominator
tend to zero for $\omega \to \omega_0$. However, both functions are
differentiable, since $V \in C^2(\mathbb{R}^2,\mathbb{R}^2)$. So
L'Hospital's Rule is applicable and yields

$\lim_{\omega \to \omega_0} \frac{U(x_0,\omega) \cdot W_x(x_0,\omega) - W(x_0,\omega) \cdot U_x(x_0,\omega)}{U^2(x_0,\omega) + W^2(x_0,\omega)} \overset{\text{L'Hosp.}}{=} \lim_{\omega \to \omega_0} \frac{(UW_{x\omega} - WU_{x\omega})(x_0,\omega) + (U_\omega W_x - W_\omega U_x)(x_0,\omega)}{(2UU_\omega + 2WW_\omega)(x_0,\omega)}$

if the latter exists.
We clearly have

$(UW_{x\omega} - WU_{x\omega})(x_0,\omega) \to 0,$

since $U(x_0,\omega) \to U(x_0,\omega_0) = 0$,
$W(x_0,\omega) \to W(x_0,\omega_0) = 0$, and $W_{x\omega}$ and
$U_{x\omega}$ are continuous and thus remain bounded. Furthermore,

$(U_\omega W_x - W_\omega U_x)(x_0,\omega) \to (U_\omega W_x - W_\omega U_x)(x_0,\omega_0) = -\det J_V(x_0,\omega_0) \neq 0,$

by assumption. Hence the numerator tends to a nonzero number, in
this case ($\det J_V(x_0,\omega_0) < 0$) a positive one. For the
denominator, we find

$(2UU_\omega + 2WW_\omega)(x_0,\omega) = \frac{\partial}{\partial\omega}(U^2 + W^2)(x_0,\omega) \begin{cases} < 0, & \text{if } \omega < \omega_0 \\ > 0, & \text{if } \omega > \omega_0 \end{cases}$

in some neighbourhood, since the function
$\omega \mapsto (U^2 + W^2)(x_0,\omega)$ has a strict local minimum
in $\omega_0$ with $(U_\omega^2 + W_\omega^2)(x_0,\omega_0) > 0$. At
the same time,

$(2UU_\omega + 2WW_\omega)(x_0,\omega) \to 0$

for $\omega \to \omega_0$, hence the denominator goes to zero from
below for $\omega \uparrow \omega_0$ and from above for
$\omega \downarrow \omega_0$. The claim follows.

The case $\det J_V(x_0,\omega_0) > 0$ is completely analogous. □
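The limit computed in the proof can also be read off from a
first-order expansion, which makes the pole structure explicit. The
following is a sketch under the assumptions of the theorem, using
only $U(x_0,\omega_0) = W(x_0,\omega_0) = 0$, the continuity of $U_x$
and $W_x$, and the differentiability of $U$ and $W$ in $\omega$; the
shorthand $\Delta = \omega - \omega_0$ is introduced here for
convenience.

```latex
% First-order expansion along the vertical path x = x_0, with \Delta = \omega - \omega_0:
%   U(x_0,\omega) = U_\omega(x_0,\omega_0)\,\Delta + o(\Delta), \quad
%   W(x_0,\omega) = W_\omega(x_0,\omega_0)\,\Delta + o(\Delta).
\[
\frac{\partial\psi}{\partial x}(x_0,\omega)
  = \frac{(U_\omega W_x - W_\omega U_x)(x_0,\omega_0)\,\Delta + o(\Delta)}
         {(U_\omega^2 + W_\omega^2)(x_0,\omega_0)\,\Delta^2 + o(\Delta^2)}
  = \frac{-\det J_V(x_0,\omega_0)}{(U_\omega^2 + W_\omega^2)(x_0,\omega_0)}
    \cdot \frac{1}{\omega - \omega_0}\,\bigl(1 + o(1)\bigr).
\]
```

The denominator is nonzero because $\det J_V(x_0,\omega_0) \neq 0$
rules out $U_\omega = W_\omega = 0$. The phase derivative over time
therefore behaves like a simple pole in $\omega$ whose sign is
determined by $-\det J_V(x_0,\omega_0)$: for
$\det J_V(x_0,\omega_0) < 0$, the case treated in the proof, it tends
to $-\infty$ from below and to $+\infty$ from above.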

For convergence along a horizontal path, we need slightly more
regularity:

Theorem 5.9 (Phase derivatives of the STFT, part II). Let
$f, g \in L^2(\mathbb{R})$. Assume that

• $V(f,g) = V = U + i \cdot W \in C^3(\mathbb{R}^2, \mathbb{R}^2)$

• $V(x_0,\omega_0) = 0$

• $\det J_V(x_0,\omega_0) \neq 0$, where $J_V$ is the Jacobian as in
  Theorem 5.8.

Denote $\psi(x,\omega) = \arg(V(f,g)(x,\omega))$. Then the phase
derivative of the STFT satisfies

$\lim_{x \to x_0} \frac{\partial\psi}{\partial x}(x,\omega_0) = c \in \mathbb{R},$

i.e. for $x \to x_0$ it converges to some real number
$c \in \mathbb{R}$.



Proof. The assumptions allow us to apply L'Hospital's Rule twice,
giving

$\lim_{x \to x_0} \frac{\partial\psi}{\partial x}(x,\omega_0) = \lim_{x \to x_0} \frac{U(x,\omega_0) \cdot W_x(x,\omega_0) - W(x,\omega_0) \cdot U_x(x,\omega_0)}{U^2(x,\omega_0) + W^2(x,\omega_0)}$

$\overset{\text{L'Hosp.}}{=} \lim_{x \to x_0} \frac{(UW_{xx} - WU_{xx})(x,\omega_0)}{(2UU_x + 2WW_x)(x,\omega_0)}$

$\overset{\text{L'Hosp.}}{=} \lim_{x \to x_0} \frac{(U_x W_{xx} + UW_{xxx} - W_x U_{xx} - WU_{xxx})(x,\omega_0)}{2\,(U_x^2 + UU_{xx} + W_x^2 + WW_{xx})(x,\omega_0)}$

if the latter limit exists. For the denominator,

$(UU_{xx} + WW_{xx})(x,\omega_0) \to (UU_{xx} + WW_{xx})(x_0,\omega_0) = 0$

for $x \to x_0$, but

$(U_x^2 + W_x^2)(x,\omega_0) \to (U_x^2 + W_x^2)(x_0,\omega_0) > 0$

converges to a nonzero number, since not both $U_x(x_0,\omega_0)$ and
$W_x(x_0,\omega_0)$ can be zero because of
$\det J_V(x_0,\omega_0) = (U_x W_\omega - U_\omega W_x)(x_0,\omega_0) \neq 0$.
The numerator obviously converges:

$(U_x W_{xx} + UW_{xxx} - W_x U_{xx} - WU_{xxx})(x,\omega_0) \to (U_x W_{xx} - W_x U_{xx})(x_0,\omega_0) \in \mathbb{R},$

thus

$\lim_{x \to x_0} \frac{\partial\psi}{\partial x}(x,\omega_0) = \frac{(U_x W_{xx} - W_x U_{xx})(x_0,\omega_0)}{2\,(U_x^2 + W_x^2)(x_0,\omega_0)} =: c \in \mathbb{R}.$

□



Concerning the partial derivatives of the phase of the STFT with
respect to the second variable ('frequency'), we can argue almost
identically and thus find the following analogous results:

Theorem 5.10 (Phase derivatives of the STFT, part III). Let
$f, g \in L^2(\mathbb{R})$ and set $V(f,g) = V = U + i \cdot W$.
Assume that

• $V(x_0,\omega_0) = 0$

• $\det J_V(x_0,\omega_0) \neq 0$.

Let $V \in C^2(\mathbb{R}^2, \mathbb{R}^2)$ and
$\psi(x,\omega) = \arg(V(f,g)(x,\omega))$, then

$\lim_{x \to x_0} \frac{\partial\psi}{\partial\omega}(x,\omega_0) = \begin{cases} -\infty, & \text{if } x \to x_0 \text{ from the left} \\ +\infty, & \text{if } x \to x_0 \text{ from the right} \end{cases}$

if $\det J_V(x_0,\omega_0) > 0$, respectively

$\lim_{x \to x_0} \frac{\partial\psi}{\partial\omega}(x,\omega_0) = \begin{cases} +\infty, & \text{if } x \to x_0 \text{ from the left} \\ -\infty, & \text{if } x \to x_0 \text{ from the right} \end{cases}$

if $\det J_V(x_0,\omega_0) < 0$.

Let $V \in C^3(\mathbb{R}^2, \mathbb{R}^2)$. Then the phase
derivative converges to some real number $c' \in \mathbb{R}$, i.e.

$\lim_{\omega \to \omega_0} \frac{\partial\psi}{\partial\omega}(x_0,\omega) = c' \in \mathbb{R}, \quad \text{if } \omega \to \omega_0.$